Abstract


An algorithm is presented that aids in deciding
whether a sample is from a single population or a
mixture of two populations. It is a combination of two
established algorithms, the EM algorithm and the
minimization of AIC. When tested on simulated data the
algorithm performed well.

Description  of  Problem 

Research into the development of improved antifouling shipbottom coatings
makes extensive use of static panel immersion tests to screen developmental
materials. A currently used procedure is to expose twelve to twenty 10" by 12"
test panels coated with the developmental coating at one or more of the Navy's
exposure sites (Miami, FL; Half Moon Bay, CA; and Nawiliwili, HI). These panels
are evaluated on a quarterly basis to estimate the amount of fouling as a
measure of the effectiveness of the coating. These life experiments can require
very long periods of time to complete, usually three or more years. The quarterly
evaluations are reported to DTNSRDC, where it is desirable to make judgments
regarding the progress of the experiment. This report suggests a method of
analysis of the interim data that would aid in judging the progress of the
experiment.

It has been observed by Becka [1983] that the fouling times for
some samples of antifouling coatings exhibit a bimodal distribution, i.e. a
distribution with two local maxima. It is believed that this can be explained
by viewing the sample as coming from two different populations, rather than the
usual view that the sample comes from a single population. For example, a
sample of twelve panels may consist of five panels fouling at a different rate
than the other seven panels. Of course, at the beginning of the experiment it
was believed that all panels were identical. Thus they should represent a
single population. It is only after the experiment has progressed for some
time that two clusters of panels may become apparent, one cluster having
substantially more fouling than the other. The problem to be analyzed in this
report is twofold. First, is there sufficient evidence to believe that the
sample has items drawn from two populations or just one? Second, how many and
which panels belong to each of the two populations?

Bimodal  Distributions 

It will be assumed that the data consist of a sample of M values of the
variable FL, where FL is the value reported from the exposure site. It is
generally a percentage, between 100 and 60, rating the amount of the panel
that is not fouled. One will generally conclude that if the FL's are a sample
from two populations, then the largest K are from one population and the
remaining (smallest) M-K are from the other population; K is to be determined
and may be zero or M. Especially when K is small, e.g. 10% of M, one should be
careful not to conclude that there are two populations without further
investigation.


Several phenomena complicate the investigation. Among these is the fact
that random samples can exhibit quite large variations. It is possible that
there will be outliers even when there is only one population. An outlier is
an observation at a great distance from the expected fouling level. The
statistical modeling of the fouling process can be very sensitive to extreme
values. If the outlier is truly from the population being studied, then it
is important to include it, since it contains information not included in the
remainder of the sample. Hence, it is important to physically examine outliers
to determine if they are in fact special or simply a manifestation of the
randomness of the sampling process. On the other hand, models that employ
estimates assuming that the data are a sample from a single population should
be used with care when the sample appears to have come from two populations.

Further  complicating  the  analysis  are  masking  and  swapping.  These  terms 
are  used  to  describe  the  problems  arising  from  the  fact  that  there  is  a  gray 
area  between  two  populations.  As  an  illustration  consider  the  histogram  of 
fouling  levels  FL  in  Figure  1. 

[Histogram of fouling levels FL; horizontal axis: Fouling Level, with ticks
at 100, 95, 90, 85, 80.]

Figure 1

The  data  point  at  95  is  masked  by  the  data  points  at  100,  in  that  they  cause 
this  point  to  appear  to  be  part  of  the  leftmost  cluster.  Of  course,  it  is 
masked,  perhaps  more  strongly,  by  the  data  points  at  90,  85  and  80.  It  is 
also  possible  that  one  of  the  points  at  100  belongs  to  the  cluster  on  the 
right  rather  than  on  the  left.  It  has  been  swapped.  It  is  unlikely  that  any 
statistical  analysis  will  completely  sort  out  these  kinds  of  problems.  It 
can  however  call  them  to  the  investigator's  attention. 

A common technique for modeling samples from two populations is to use
a mixture of two distributions. Suppose f(x;P) is the probability function
describing one of the two populations, where P is a vector of parameters and x
is the value of the variable, in this setting FL; and g(x;Q) is the probability
function describing the other population. The mixing proportion, MixProp,
is a number between 0 and 1 and

$\mathrm{MixProp} \cdot f(x;P) + (1-\mathrm{MixProp}) \cdot g(x;Q)$    EQ. 1

is the probability function describing a mixture of the two populations. Each
item in a sample of size M from the mixture can be thought of as coming from
the population f(x;P) with probability MixProp. Said another way, in a sample
of size M one expects, on the average, K = M*MixProp of the items in the
sample to have come from the population f(x;P). Hence, to decide if there are
two populations represented in a sample, it is sufficient to estimate the
parameter MixProp.

As already noted, the phenomenon under investigation was first observed by
Becka [1983] as a bimodal distribution. She observed that the reported data
had a histogram similar to the one in Figure 1. It is, of course, not necessarily
the case that a mixture of two distributions is bimodal. Consider, for example,
a mixture of two binomial distributions

$\mathrm{MixProp} \cdot \mathrm{bin}(x;p_0,N) + (1-\mathrm{MixProp}) \cdot \mathrm{bin}(x;p_1,N)$    EQ. 2

where

$\mathrm{bin}(x;p,N) = \frac{N!}{x!\,(N-x)!}\, p^{x} (1-p)^{N-x}$ for $x = 0, \ldots, N$    EQ. 3

is the usual binomial density with N trials and probability of success p. It
can be shown that if p0 = 0.95, p1 = 0.8 and N = 20, then MixProp must be
between 0.296 and 0.347 in order for the mixture in EQ. 2 to be bimodal. See
Table 1 for the interval of MixProp that will give a bimodal distribution for
various values of p0 and p1. Even though the mixing of two populations was
first observed in the bimodal case, it is now apparent that a test for mixing
rather than for bimodality is needed.


Table 1

INTERVAL FOR MIXING PROPORTION
THAT RESULTS IN A BIMODAL DISTRIBUTION

                            p0
  p1        0.95             0.9              0.85

  0.65   (.041, .98)      (.19, .656)      (.402, .434)
  0.7    (.10, .868)      (.315, .456)     never bimodal
  0.75   (.195, .547)     never bimodal    never bimodal
  0.8    (.296, .347)     never bimodal    never bimodal
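
Intervals of the kind listed in Table 1 can be explored numerically. The
following is a minimal sketch in Python, not the method used to build the
table: it evaluates the mixture of EQ. 2 on 0, ..., N, counts strict local
maxima, and scans a grid of MixProp values. The function names, the grid
resolution, and the strict-maximum definition of a mode are assumptions made
for illustration.

```python
from math import comb


def bin_pmf(x, p, n):
    """Binomial probability of x successes in n trials (EQ. 3)."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def mixture_pmf(mix_prop, p0, p1, n):
    """Probability function of the two-binomial mixture (EQ. 2) on 0..n."""
    return [
        mix_prop * bin_pmf(x, p0, n) + (1.0 - mix_prop) * bin_pmf(x, p1, n)
        for x in range(n + 1)
    ]


def count_modes(pmf):
    """Count strict local maxima; exact ties between neighbours are ignored."""
    last = len(pmf) - 1
    modes = 0
    for i, v in enumerate(pmf):
        above_left = i == 0 or v > pmf[i - 1]
        above_right = i == last or v > pmf[i + 1]
        if above_left and above_right:
            modes += 1
    return modes


def bimodal_range(p0, p1, n, steps=1000):
    """Smallest and largest grid value of MixProp giving two or more modes.

    Returns None when no grid value produces a bimodal mixture.
    """
    hits = [
        k / steps
        for k in range(1, steps)
        if count_modes(mixture_pmf(k / steps, p0, p1, n)) >= 2
    ]
    return (min(hits), max(hits)) if hits else None
```

Calling bimodal_range(0.95, 0.8, 20) gives a grid approximation that can be
compared with the corresponding entry of Table 1; the agreement is limited by
the grid spacing of 1/steps.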

Discussion  of  General  Methodology 


The statistical literature is replete with discussions and suggestions
for determining both outliers and mixing proportions; for a survey see
Beckman and Cook [1983]. Two methods stand out for applications of the type
required in this report. They are the EM algorithm, Dempster et al.
[1977], and AIC minimization, Akaike [1977]. A general discussion of these
methods as they apply to the present context follows. The EM algorithm will
be discussed first, since it is needed to compute AIC.


The maximum likelihood principle is essential to both methods. It is
the naive notion that given a collection of choices one should choose the one
that is most likely. In many common situations there is a closed form for
the maximum likelihood estimator of the unknown parameter. There is not a
closed form solution for the unknown parameter, MixProp, in our setting,
however. The EM algorithm provides an iterative, numerical method for
approximating the unknown parameters. For the model which is a mixture of
two binomials, EQ. 2, and a sample x_1, ..., x_M, the updated estimates are

$p_j = \Bigl( \sum_{i=1}^{M} x_i \, \mathrm{bin}(x_i; \mathrm{Old}p_j, N) / \mathrm{mixbin}(x_i) \Bigr) / (N M), \quad j = 0, 1$    EQ. 4

$\mathrm{MixProp} = \mathrm{OldMixProp} \cdot \Bigl( \sum_{i=1}^{M} \mathrm{bin}(x_i; p_0, N) / \mathrm{mixbin}(x_i) \Bigr) / M$    EQ. 5

where mixbin(x) is the mixture in EQ. 2 and both sums run from i = 1 to M.


This algorithm necessarily converges. However, it may be slow, and it may
converge to a local extremum rather than the absolute maximum if the starting
point is not carefully selected. This latter difficulty can usually be avoided
by trying several starting points and keeping the result with the largest
likelihood among those generated by the various starting points.
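
The iteration and the use of several starting points can be sketched as
follows. This is a minimal illustration in Python under stated assumptions,
not the program of Appendix A: it uses the standard EM updates for a
two-binomial mixture, in which each p is normalized by the sum of its own
weights (as procedure Newp in Appendix A does) rather than by M. All function
names, the tolerance, and the iteration cap are hypothetical.

```python
from math import comb, log


def bin_pmf(x, p, n):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def neg_log_like(data, p0, p1, mix, n):
    """Negative natural log of the likelihood of the mixture."""
    return -sum(
        log(mix * bin_pmf(x, p0, n) + (1.0 - mix) * bin_pmf(x, p1, n))
        for x in data
    )


def em_step(data, p0, p1, mix, n):
    """One EM update of (p0, p1, MixProp)."""
    # E-step: weight of the p0 population for each observation.
    w = []
    for x in data:
        a = mix * bin_pmf(x, p0, n)
        b = (1.0 - mix) * bin_pmf(x, p1, n)
        w.append(a / (a + b))
    s0 = sum(w)
    s1 = len(data) - s0
    # M-step: weighted success fractions, clamped against rounding above 1.
    new_p0 = min(1.0, sum(wi * x for wi, x in zip(w, data)) / (n * s0))
    new_p1 = min(1.0, sum((1.0 - wi) * x for wi, x in zip(w, data)) / (n * s1))
    return new_p0, new_p1, s0 / len(data)


def em_fit(data, n, p0, p1, mix, tol=1e-8, max_iter=500):
    """Iterate EM from one starting point; also return the likelihood trace."""
    trace = [neg_log_like(data, p0, p1, mix, n)]
    for _ in range(max_iter):
        new = em_step(data, p0, p1, mix, n)
        done = max(abs(a - b) for a, b in zip(new, (p0, p1, mix))) < tol
        p0, p1, mix = new
        trace.append(neg_log_like(data, p0, p1, mix, n))
        if done:
            break
    return p0, p1, mix, trace


def best_of_starts(data, n, starts):
    """Run EM from each (p0, p1, MixProp) start; keep the largest likelihood."""
    fits = [em_fit(data, n, *start) for start in starts]
    return min(fits, key=lambda fit: fit[3][-1])
```

The returned trace makes it easy to check that the negative log likelihood
does not increase from one iteration to the next, which is the sense in which
the iteration converges.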

The EM algorithm can be applied in this setting as follows. Apply the
algorithm with the starting value MixProp = K/M for K = 1, ..., M, where M is
the sample size. For each of these estimates compute the likelihood. Actually
it is easier, and more common, to compute the negative of the natural
logarithm of the likelihood:

$-\sum_{i=1}^{M} \ln\bigl(\mathrm{mixbin}(x_i; p_0, p_1, \mathrm{MixProp}, N)\bigr)$    EQ. 6

where the sum runs from i = 1 to i = M and mixbin(x_i; p0, p1, MixProp, N) is
the mixture of two binomials as in EQ. 2. Select the estimate of MixProp that
yields the largest likelihood (smallest negative ln likelihood). When using
this procedure the estimate of MixProp will usually not be of the form K/M.
That is, MixProp*M, which should be interpreted as the number of sample units
from the population with parameter p0, will not necessarily be a whole number.
For example, the statistical analysis may report that 5.7 of the sample units
are from one population and the other 10.3 are from the other population. One
is reminded that masking and swapping are present. When applied to simulated
data this procedure works well, but can be improved by using the AIC
minimization principle described next.


The  problem  of  selecting  the  "best"  model  from  several  competing  models  is 
a  common  problem.  Akaike  [1977]  suggested  employing  the  principle  of  minimizing 
the  negative  entropy  in  the  selection  process.  For  a  specific  model  define  its 
AIC  by 




$\mathrm{AIC} = -2 \ln(\text{maximum likelihood}) + 2\,(\text{number of independently adjusted parameters})$.    EQ. 6

To apply this principle in the present setting, select the model with the
smallest AIC from among the models

$\mathrm{MixProp} \cdot \mathrm{bin}(x;p_0,N) + (1-\mathrm{MixProp}) \cdot \mathrm{bin}(x;p_1,N)$    EQ. 7

where MixProp = K/M indexes the models for K = 1, ..., M. That is, index the
models under consideration by the number of sample units from the population
with parameter p0. Hence there are M models to select from. For K = 1, ..., M-1
there are two independently adjusted parameters, p0 and p1. For K = M there is
only one parameter, namely p0, since all sample units are from one population.

This procedure differs from the procedure using the EM algorithm in two
respects. The values of MixProp*M = K are restricted to being whole numbers.
More significantly, observe that if it were not for the second term in the
expression for AIC, namely 2*(number of independently adjusted parameters),
then minimizing AIC would be equivalent to maximizing the likelihood. This
extra term in AIC results in a preference for selecting the model that says
the data are from a single population. Akaike [1977] claims that in fact AIC
corrects for a bias in the maximum likelihood principle that causes a model
with fewer parameters to be rejected too often.

The algorithm based on AIC has been found to be sensitive to the estimate
of the maximum likelihood. In particular, if the maximum likelihood estimates
of the parameters p0 and p1 are not accurately made, then the results of
minimizing AIC can be quite unsatisfactory. The values of AIC differ very
little from one model to the next. There are no tables of the probability
distribution of AIC, hence it is difficult to judge whether small differences
are significant or not. This difficulty is alleviated somewhat by computing
good estimates of p0 and p1. The EM algorithm works well when applied to this
problem. In this setting apply it as above, but do not use it to update the
estimate of MixProp, since it is fixed for each model. Compute new estimates
of p0 and p1 only.
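
The combined procedure can be sketched as follows: for each K the value
MixProp = K/M is held fixed, EM is used to update p0 and p1 only, and the
model with the smallest AIC is kept. This is a minimal Python illustration,
not a transcription of Appendix A. The function names, the tolerance, and the
choice to sort the sample in decreasing order before forming the initial
estimates are assumptions.

```python
from math import comb, log


def bin_pmf(x, p, n):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def neg_log_like(data, p0, p1, mix, n):
    """Negative natural log of the likelihood of the mixture."""
    return -sum(
        log(mix * bin_pmf(x, p0, n) + (1.0 - mix) * bin_pmf(x, p1, n))
        for x in data
    )


def fit_fixed_mix(data, n, k, tol=1e-8, max_iter=1000):
    """EM for p0 and p1 with MixProp fixed at k/M; returns (p0, p1, -ln L)."""
    m = len(data)
    if k == m:
        # Single population: the maximum likelihood estimate is closed form.
        p0 = sum(data) / (n * m)
        return p0, None, neg_log_like(data, p0, p0, 1.0, n)
    mix = k / m
    # Assumption: initial estimates come from the k largest and the
    # remaining m - k observations.
    ordered = sorted(data, reverse=True)
    p0 = sum(ordered[:k]) / (n * k)
    p1 = sum(ordered[k:]) / (n * (m - k))
    for _ in range(max_iter):
        w = []
        for x in data:
            a = mix * bin_pmf(x, p0, n)
            b = (1.0 - mix) * bin_pmf(x, p1, n)
            w.append(a / (a + b))
        s0 = sum(w)
        s1 = m - s0
        new_p0 = min(1.0, sum(wi * x for wi, x in zip(w, data)) / (n * s0))
        new_p1 = min(
            1.0, sum((1.0 - wi) * x for wi, x in zip(w, data)) / (n * s1)
        )
        converged = abs(new_p0 - p0) < tol and abs(new_p1 - p1) < tol
        p0, p1 = new_p0, new_p1
        if converged:
            break
    return p0, p1, neg_log_like(data, p0, p1, mix, n)


def select_k(data, n):
    """Return the (AIC, K, p0, p1) row with the smallest AIC, and all rows."""
    m = len(data)
    rows = []
    for k in range(1, m + 1):
        p0, p1, nll = fit_fixed_mix(data, n, k)
        n_params = 1 if k == m else 2
        rows.append((2.0 * nll + 2.0 * n_params, k, p0, p1))
    return min(rows, key=lambda row: row[0]), rows
```

A sample in which every panel has the same fouling level gains nothing from a
second population, so the penalty term leads select_k to return K = M in that
case.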


Results  of  Testing  the  Algorithm 

The algorithm described in the previous section was programmed in Turbo
Pascal and applied to four data sets. These data sets were generated so as
to have certain properties: (1) p0 = 0.95, p1 = 0.7, MixProp = 0.35, strikingly
bimodal; (2) p0 = 0.95, p1 = 0.7, MixProp = 0.75, less strikingly bimodal; (3)
p0 = 0.95, p1 = 0.8, MixProp = 0.29, not bimodal but having a relatively large
variance; and (4) p0 = 0.9, MixProp = 1, a single population. Graphs of these
are provided in Appendix B. Listings of the programs, with notes, are provided
in Appendix A. One hundred sets of data were generated for each of these
four cases. The combined AIC and EM algorithms were used to estimate
K = MixProp*M, the number of sample units from the population with parameter
p0. The results of this simulation are summarized in Table 2, which contains
the counts of the number of times estK was the estimate of MixProp*M.
Throughout this simulation M = 16 and N = 20.
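
The report does not say how the simulated data sets were drawn. One plausible
scheme, sketched below in Python, assigns each of the M panels to the p0
population with probability MixProp and then draws a binomial count of N
trials for it. The function names, the seed, and the use of Python's random
module are assumptions made for illustration.

```python
import random


def simulate_sample(p0, p1, mix_prop, m, n, rng):
    """Draw one sample of m binomial(n, p) counts from the mixture."""
    sample = []
    for _ in range(m):
        # Each panel comes from the p0 population with probability mix_prop.
        p = p0 if rng.random() < mix_prop else p1
        # Binomial draw as a sum of n Bernoulli trials.
        sample.append(sum(1 for _ in range(n) if rng.random() < p))
    return sample


def simulate_case(p0, p1, mix_prop, m=16, n=20, reps=100, seed=0):
    """Generate reps independent samples for one of the four cases."""
    rng = random.Random(seed)
    return [simulate_sample(p0, p1, mix_prop, m, n, rng) for _ in range(reps)]
```

For example, simulate_case(0.95, 0.7, 0.35) produces one hundred samples of
sixteen counts for case (1). Under this scheme the number of panels drawn from
the p0 population varies from sample to sample around M*MixProp.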


Table 2

Frequency Table of Estimated M*MixProp

  estK     (1)     (2)     (3)     (4)

    1       1       0       2       0
    2       4       0       6       0
    3       6       1      10       0
    4       6       0       8       2
    5      22       3      *9       0
    6     *16       1       3       0
    7      15       0       7       0
    8      12       8       7       0
    9      12       8       6       1
   10       4      10       3       0
   11       1      19       1       0
   12       0     *13       4       0
   13       0      15       0       0
   14       0      12       1       0
   15       0       4       3       0
   16       1       6      29     *97

(1) p0 = 0.95, p1 = 0.70, MixProp = 0.35

(2) p0 = 0.95, p1 = 0.70, MixProp = 0.75

(3) p0 = 0.95, p1 = 0.80, MixProp = 0.29

(4) p0 = 0.90, MixProp = 1.0

An * marks the value of K = M*MixProp.

If there were no variation in the simulated data sets, and if they came
from the prescribed population with certainty, and if the algorithm worked
perfectly, then the numbers preceded by an * in Table 2 would be 100. The
algorithm works quite well in cases (1) and (2) and amazingly well in case
(4). Case (3) and case (4) show the disposition of AIC to favor the model of
a single population. Case (3) is generated from a mixture that is not bimodal,
but has a large variance.

Consider  some  illustrations  from  case  (3): 

[Histogram of one simulated sample of sixteen fouling levels; axis ticks at
100, 95, 90, 85, 80, 75.]

The parameters that were used to generate this data were p0 = 0.95, p1 = 0.8,
and K = 4.64. That is, one expects the leftmost 5 data points to be from one
population and the rightmost 11 to be from another. On the other hand, the
estimated values are estp0 = 0.859 and estK = 16. That is, the sample is
judged to be from a single population with parameter 0.859. The problem here
is that since these distributions are so close, there is a significant amount
of swapping and masking, resulting in a distribution that appears to be that
of a single population rather than that of a mixture.

Another  illustration  from  this  data  set  follows: 

[Histogram of one simulated sample of sixteen fouling levels; axis ticks at
100, 95, 90, 85, 80, 75, 70.]

The estimated parameters this time are estp0 = 0.999, estp1 = 0.793, and
estK = 1. That is, one of the data points at 100 is from one population and
the remaining fifteen are from another. Of course, there would be no way of
knowing which of the two test panels that are 100% unfouled belongs to which
population. One would either treat them both as coming from a population
distinct from that of the other fourteen, or treat the entire sample of
sixteen as coming from a single population.

Another  illustration  from  this  data  set  follows: 

[Histogram of one simulated sample of sixteen fouling levels; axis ticks at
100, 95, 90, 85, 80, 75.]

The estimated parameters are estp0 = 0.935, estp1 = 0.792, and estK = 10. This
data set was generated with K = 5. That is, the expected number of data points
in the leftmost population is five, not the estimated ten. Although ten seems
more reasonable than five, neither reflects the possibility that the two
points at 85 could belong to the leftmost group.


This algorithm works best when the two populations are widely separated
or when there is only one population. This is not surprising, since swapping
and masking are less important when the populations are not close. When
the populations are distinctly separated, the mixture distributions tend
to be bimodal. Hence, one observes that the algorithm works best at separating
the populations when the mixture is bimodal. The algorithm's excellent
performance in recognizing a single population can be attributed to the fact
that AIC is adjusted in favor of selecting simpler models.

CONCLUSIONS 

The  proposed  algorithm  works  well,  but  does  not  make  the  decision  for 
the  experimenter.  This  algorithm  should  be  used  as  a  decision  aid  and  not  as 
a  decision  rule.  One  should  keep  in  mind  the  problems  of  swapping  and  masking 
and  remember  that  the  purpose  of  the  analysis  is  to  decide  which  panels  might 
require  further  investigation  or  monitoring. 

The  algorithm  described  in  this  report  has  been  tested  only  on  simulated 
data.  It  should  be  tested  on  actual  data  that  is  well  understood.  That  is, 
it  should  be  tested  on  actual  field  data  that  apparently  comes  from  a  single 
population  and  on  data  that  seems  to  come  from  two  populations.  Finally,  it 
should  be  used  as  an  aid  in  making  interim  judgments  to  determine  if  it  is 
useful  for  that  task. 
APPENDIX A
PROGRAM  LISTINGS 


This appendix contains listings of the following program and
procedures:

AICMIXBI.PAS - the main program;

BINTABLE.PAS - a procedure for computing binomial probabilities;

MIXTABLE.PAS - a procedure for computing probability mixtures;

NEGLLIKE.PAS - a procedure for computing negative log likelihoods;

NEWP.PAS - a procedure for computing new values of p0 and p1.

program AICMIXBIN;

{ Approximates the parameters of a mixture of two binomials via the AIC and EM
  algorithms. }

Label
  DataError;

Const
  N = 20;
  ssize = 16;
  er = 0.0005;

Type
  Sample = array[1..ssize] of integer;
  ProbTable = array[0..N] of real;
  Out = record
    prob0: real;
    prob1: real;
    mixprob: real;
    DataNum: integer;
  end;

Var
  p0, p1, mixprop, oldp0, oldp1, newll, aic, minaic: real;
  indx, jndx, start, finis: integer;
  cf0, f0, cf1, f1, cm, m: ProbTable;
  data: Sample;
  DataFile: file of Sample;
  OutFile: file of Out;
  OutData: Out;

{$I B:BINTABLE.PAS}    { Listings of these    }
{$I B:MIXTABLE.PAS}    { include files follow }
{$I B:NEGLLIKE.PAS}    { the listing of the   }
{$I B:NEWP.PAS}        { main program         }

BEGIN
  assign(DataFile, 'B:95707516.DAT'); reset(DataFile);
  assign(OutFile, 'B:95707516.aic'); rewrite(OutFile);
  writeln('start, end');
  readln(start, finis);
  writeln(' ');
  seek(DataFile, start);

  repeat
    read(DataFile, Data);
    { Reject records with values outside 0..N }
    for indx := 1 to ssize do
    begin
      if (Data[indx] < 0) or (Data[indx] > N) then
      begin
        writeln('Error in the range for the data in record',
                FilePos(DataFile) - 1);
        goto DataError;
      end;
    end;

    { One model for each K = indx, with MixProp fixed at indx/ssize }
    for indx := 1 to ssize do
    begin
      { Initial estimates }
      mixprop := indx / ssize;
      p0 := 0.0;
      for jndx := 1 to indx do p0 := p0 + Data[jndx];
      p0 := p0 / N / indx;
      p1 := 0.0;
      for jndx := indx + 1 to ssize do p1 := p1 + Data[jndx];
      if ssize = indx then
        p1 := 0.5
      else
        p1 := p1 / N / (ssize - indx);

      { EM iteration for p0 and p1 only }
      repeat
        oldp0 := p0; oldp1 := p1;
        BinTable(p0, N, cf0, f0);
        BinTable(p1, N, cf1, f1);
        MixTable(mixprop, 0, N, cf0, f0, cf1, f1, cm, m);
        Newp(f0, f1, m, Data, ssize, N, p0, p1);
      until (abs(p0 - oldp0) < er) and (abs(p1 - oldp1) < er);

      negllike(m, Data, ssize, newll);
      aic := 2 * newll + 4;
      if indx = ssize then aic := aic - 2;
      if indx = 1 then minaic := aic;
      { <= so that the first model (indx = 1) is also recorded }
      if aic <= minaic then
      begin
        minaic := aic;
        with OutData do
        begin
          prob0 := p0;
          prob1 := p1;
          mixprob := mixprop;
          DataNum := FilePos(DataFile) - 1;
        end;
      end;
    end;

    { write(OutFile, OutData); }
    writeln(lst, OutData.prob0:7:3, OutData.prob1:7:3, OutData.mixprob:7:3,
            16*OutData.mixprob:5:1, OutData.DataNum:6);
    writeln(lst, ' ');  { form feed }
    writeln(OutData.prob0:7:3, OutData.prob1:7:3, OutData.mixprob:7:3,
            16*OutData.mixprob:3:1, OutData.DataNum:6);

  DataError: ;
  { until eof(DataFile); }
  until FilePos(DataFile) = finis;

  flush(OutFile);
  close(OutFile);
  close(DataFile);
END.

procedure BinTable(p: real; N: integer;
                   var CumProbFunc, ProbFunc: ProbTable);

{ Procedure to compute the cumulative probability function and the
  probability function of a Binomial distribution. }

Var
  indx: integer;
  lnprob: array[0..20] of real;
  prob, q: real;

begin
  q := 1 - p;
  prob := 1;

  { Degenerate case: all mass at x = N }
  if p >= 1.0 then
  begin
    for indx := 0 to N - 1 do
    begin
      CumProbFunc[indx] := 0.0;
      ProbFunc[indx] := 0.0;
    end;
    CumProbFunc[N] := 1.0;
    ProbFunc[N] := 1.0;
  end;

  { Degenerate case: all mass at x = 0 }
  if p <= 0.0 then
  begin
    for indx := 1 to N do
    begin
      CumProbFunc[indx] := 1.0;
      ProbFunc[indx] := 0.0;
    end;
    CumProbFunc[0] := 1.0;
    ProbFunc[0] := 1.0;
  end;

  { General case: build ln of the probabilities recursively }
  if (p > 0.0) and (p < 1.0) then
  begin
    lnprob[0] := N * ln(q);
    for indx := 1 to N do
    begin
      lnprob[indx] := lnprob[indx - 1] + ln(p) - ln(q) + ln(N - indx + 1)
                      - ln(indx);
    end;
    for indx := 0 to N do ProbFunc[indx] := exp(lnprob[indx]);
    for indx := 0 to N do CumProbFunc[indx] := 0.0;
    CumProbFunc[0] := ProbFunc[0];
    for indx := 1 to N do
      CumProbFunc[indx] := CumProbFunc[indx - 1] + ProbFunc[indx];
    CumProbFunc[N] := 1.0;
  end;
end;

procedure MixTable(mixprop: real; LowRange, UpRange: integer;
                   CumProbFunc0, ProbFunc0, CumProbFunc1, ProbFunc1: ProbTable;
                   var MixCumProbFunc, MixProbFunc: ProbTable);

{ Computes the cumulative probability function and the probability function
  of a mixture. }

Var
  indx: integer;

begin
  for indx := LowRange to UpRange do
  begin
    MixProbFunc[indx] := mixprop*ProbFunc0[indx]
                         + (1 - mixprop)*ProbFunc1[indx];
    MixCumProbFunc[indx] := mixprop*CumProbFunc0[indx]
                            + (1 - mixprop)*CumProbFunc1[indx];
  end;
end;

procedure negllike(density: ProbTable; data: Sample; SSize: integer;
                   var NegLnLike: real);

{ Computes the negative of the log likelihood for the density function. }

Var
  indx: integer;

begin
  NegLnLike := 0.0;
  for indx := 1 to SSize do
    NegLnLike := NegLnLike - ln(density[data[indx]]);
end;


procedure Newp(Density0, Density1, MixDensity: ProbTable;
               Data: Sample;
               ssize, N: integer;
               var p0, p1: real);

{ Uses formula from EM algorithm to compute new values of p0 and p1. }

var
  M0, M1: real;
  indx: integer;

begin
  p0 := 0; p1 := 0;
  M0 := 0; M1 := 0;

  { Sums of the density ratios for each population }
  for indx := 1 to ssize do
  begin
    M0 := M0 + Density0[Data[indx]] / MixDensity[Data[indx]];
    M1 := M1 + Density1[Data[indx]] / MixDensity[Data[indx]];
  end;

  { Data values weighted by the same ratios }
  for indx := 1 to ssize do
  begin
    p0 := p0 + Data[indx]*Density0[Data[indx]] / MixDensity[Data[indx]];
    p1 := p1 + Data[indx]*Density1[Data[indx]] / MixDensity[Data[indx]];
  end;

  p0 := p0 / N / M0;
  p1 := p1 / N / M1;
end;

Note 1: Variables of type ProbTable that begin with the letter "c" are
        cumulative probability functions and are not needed for the present
        analysis.

Note 2: In procedure BINTABLE.PAS, the binomial probabilities are computed
        using logarithms, because computing them in the usual way causes an
        overflow error due to the fact that p0 or p1 could be very close to
        one or zero.
APPENDIX B

GRAPHS  OF  MIXTURES  OF  BINOMIALS 


This appendix contains graphs of the probability functions

$\mathrm{MixProp} \cdot \mathrm{bin}(x;p_0,N) + (1-\mathrm{MixProp}) \cdot \mathrm{bin}(x;p_1,N)$

for the following four cases:

(1) p0 = 0.95, p1 = 0.70, MixProp = 0.35

(2) p0 = 0.95, p1 = 0.70, MixProp = 0.75

(3) p0 = 0.95, p1 = 0.80, MixProp = 0.29

(4) p0 = 0.90, MixProp = 1.0

and where

$\mathrm{bin}(x;p,N) = \frac{N!}{x!\,(N-x)!}\, p^{x} (1-p)^{N-x}$ for $x = 0, \ldots, N$

is the binomial probability function.